Dynamic-CSR: A Format for Dynamic Sparse-Matrix Updates
Authors
Abstract
Sparse matrices are a core component of many numerical simulations, and their efficiency is essential to achieving high performance. Dynamic sparse-matrix allocation (insertion) can benefit a number of problems, such as sparse-matrix factorization, sparse matrix-matrix addition, static analysis (e.g., points-to analysis), computing transitive closure, and other graph algorithms. Existing sparse-matrix formats are poorly suited to dynamic updates. The compressed sparse-row (CSR) format is fully compact and must be rebuilt after each new entry. ELLPACK (ELL) stores a constant number of entries per row, which allows efficient insertion and sparse matrix-vector multiplication (SpMV) but is memory inefficient and strictly limits row size. The coordinate (COO) format stores a list of entries and is efficient in both memory use and insertion time, but it is much less efficient at SpMV. Hybrid ELLPACK (HYB) compromises by combining ELL and COO, but its performance degrades as the COO portion fills up: any row that spills into the COO portion forces the entire COO portion to be traversed during every SpMV operation. In this paper we present dynamic compressed sparse row (DCSR), a sparse-matrix format that allows asynchronous dynamic updates. Using DCSR, we demonstrate an improved sparse matrix-matrix multiplication (SpMM) algorithm and evaluate it with algebraic multigrid (AMG). AMG is a multigrid method in which the hierarchy of operators is constructed from the matrix itself rather than from the geometry of the mesh; it can be expressed in terms of SpMM, SpMV, and primitive parallel operations. DCSR provides significant performance improvements for SpMM through its increased memory efficiency and support for asynchronous dynamic updates.
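To make the insertion trade-off concrete, the sketch below contrasts inserting a single entry into plain CSR (the tail of the index and value arrays must shift and every later row pointer must be bumped) with appending to COO. This is a minimal C++ illustration, not code from the paper; the struct and member names are our own.

    #include <cstdio>
    #include <vector>

    // Plain CSR: fully compact, so inserting a new entry means shifting every
    // element after the insertion point and bumping all later row pointers.
    struct Csr {
        std::vector<int>    row_ptr;  // size = num_rows + 1
        std::vector<int>    col_idx;  // size = nnz
        std::vector<double> vals;     // size = nnz

        // O(nnz) per insertion: the whole tail of col_idx/vals is moved.
        void insert(int row, int col, double v) {
            int pos = row_ptr[row + 1];                 // append at the end of the row
            col_idx.insert(col_idx.begin() + pos, col);
            vals.insert(vals.begin() + pos, v);
            for (std::size_t r = row + 1; r < row_ptr.size(); ++r)
                ++row_ptr[r];                           // shift all later rows
        }
    };

    // COO: insertion is a cheap append, but SpMV must scan the whole entry list.
    struct Coo {
        std::vector<int>    rows, cols;
        std::vector<double> vals;

        void insert(int row, int col, double v) {       // O(1) amortized
            rows.push_back(row);
            cols.push_back(col);
            vals.push_back(v);
        }
    };

    int main() {
        Csr a{{0, 0, 0, 0}, {}, {}};   // empty 3x3 matrix in CSR form
        a.insert(1, 2, 5.0);           // every later row pointer must be updated
        Coo b;
        b.insert(1, 2, 5.0);           // just an append
        std::printf("csr nnz = %zu, coo nnz = %zu\n", a.vals.size(), b.vals.size());
        return 0;
    }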
Similar resources
Dynamic Sparse-Matrix Allocation on GPUs
Sparse matrices are a core component in many numerical simulations, and their efficiency is essential to achieving high performance. Dynamic sparse-matrix allocation (insertion) can benefit a number of problems such as sparse-matrix factorization, sparse-matrix-matrix addition, static analysis (e.g., points-to analysis), computing transitive closure, and other graph algorithms. Existing sparse-...
Performance aspects of sparse matrix-vector multiplication
Sparse matrix-vector multiplication (SpM×V for short) is an important building block in algorithms that solve sparse systems of linear equations, e.g., in FEM. Due to matrix sparsity, the memory access patterns are irregular, and cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking [...
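The irregular access pattern the snippet refers to shows up directly in a scalar CSR SpMV loop: the gather x[col_idx[j]] is data dependent, so cache reuse on x depends entirely on the matrix's nonzero structure. A minimal C++ sketch (our own illustration, not code from the cited paper):

    #include <vector>

    // Minimal CSR SpMV (y = A * x). The gather x[col_idx[j]] is the irregular,
    // cache-unfriendly access; reordering and register blocking aim to make
    // these accesses more local.
    void spmv_csr(const std::vector<int>& row_ptr,
                  const std::vector<int>& col_idx,
                  const std::vector<double>& vals,
                  const std::vector<double>& x,
                  std::vector<double>& y) {
        const int num_rows = static_cast<int>(row_ptr.size()) - 1;
        for (int i = 0; i < num_rows; ++i) {
            double sum = 0.0;
            for (int j = row_ptr[i]; j < row_ptr[i + 1]; ++j)
                sum += vals[j] * x[col_idx[j]];   // indirect, data-dependent access
            y[i] = sum;
        }
    }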
Adaptive Row-grouped CSR Format for Storing of Sparse Matrices on GPU
We present a new adaptive format for storing sparse matrices on the GPU. We compare it with several other formats, including CUSPARSE, which is today probably the best choice for processing sparse matrices on the GPU in CUDA. Unlike CUSPARSE, which works with the common CSR format, our new format requires conversion. However, sparse matrix-vector multiplication is significantly faster for many ma...
An FPGA Drop-In Replacement for Universal Matrix-Vector Multiplication
We present the design and implementation of a universal, single-bitstream library for accelerating matrix-vector multiplication using FPGAs. Our library handles multiple matrix encodings ranging from dense to multiple sparse formats. A key novelty in our approach is the introduction of a hardware-optimized sparse matrix representation called Compressed Variable-Length Bit Vector (CVBV), which re...
Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format
In earlier work we introduced the "Recursive Sparse Blocks" (RSB) sparse matrix storage scheme, oriented towards cache-efficient matrix-vector multiplication (SpMV) and triangular solution (SpSV) on cache-based shared-memory parallel computers. Both the transposed (SpMV^T) and symmetric (SymSpMV) matrix-vector multiply variants are supported. RSB stands for a meta-format: it recursively...
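The snippet is cut off, but the format's name points to a recursive partitioning of the matrix into blocks. Purely as an illustration of that general idea (not the RSB implementation itself), the C++ sketch below splits a COO entry list into quadrants until each leaf block holds at most a chosen number of nonzeros; the type names and the leaf-size criterion are our own assumptions.

    #include <cstddef>
    #include <utility>
    #include <vector>

    struct Entry { int row, col; double val; };

    struct Block {
        int row0, col0, rows, cols;     // extent of this submatrix
        std::vector<Entry> entries;     // leaf storage (plain COO here for simplicity)
        std::vector<Block> children;    // empty for leaves
    };

    // Recursively split a block into four quadrants until it is "small enough"
    // (a stand-in for a cache-size criterion); leaves keep their entries in COO.
    Block partition(std::vector<Entry> entries, int row0, int col0, int rows, int cols,
                    std::size_t leaf_nnz) {
        Block b{row0, col0, rows, cols, {}, {}};
        if (entries.size() <= leaf_nnz || rows <= 1 || cols <= 1) {
            b.entries = std::move(entries);
            return b;
        }
        const int rh = rows / 2, ch = cols / 2;
        std::vector<Entry> quad[4];
        for (const Entry& e : entries) {
            const int qi = (e.row >= row0 + rh ? 2 : 0) + (e.col >= col0 + ch ? 1 : 0);
            quad[qi].push_back(e);
        }
        b.children.push_back(partition(std::move(quad[0]), row0,      col0,      rh,        ch,        leaf_nnz));
        b.children.push_back(partition(std::move(quad[1]), row0,      col0 + ch, rh,        cols - ch, leaf_nnz));
        b.children.push_back(partition(std::move(quad[2]), row0 + rh, col0,      rows - rh, ch,        leaf_nnz));
        b.children.push_back(partition(std::move(quad[3]), row0 + rh, col0 + ch, rows - rh, cols - ch, leaf_nnz));
        return b;
    }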